The permutable POMDP: fast solutions to POMDPs for preference elicitation

Authors

  • Finale Doshi-Velez
  • Nicholas Roy
Abstract

The ability of an agent to reason under uncertainty is crucial for many planning applications, since an agent rarely has access to complete, error-free information about its environment. Partially Observable Markov Decision Processes (POMDPs) are a desirable framework in these planning domains because the resulting policies allow the agent to reason about its own uncertainty. In domains with hidden state and noisy observations, POMDPs optimally trade off actions that increase an agent’s knowledge against actions that increase an agent’s reward. Unfortunately, for many real-world problems, even approximating good POMDP solutions is computationally intractable without leveraging structure in the problem domain. We show that the structure of many preference elicitation problems, in which the agent must discover some hidden preference or desire from another (usually human) agent, allows the POMDP to be solved with exponentially fewer belief points than standard point-based approximations while retaining the quality of the solution.
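As a rough illustration of what a "belief" over a hidden preference looks like, the sketch below performs a single Bayesian belief update after one noisy observation. It is not the method of the paper: the function name `update_belief`, the tea/coffee preferences, and the 0.8 observation accuracy are all hypothetical, and the sketch assumes the hidden preference does not change between steps.

```python
def update_belief(belief, likelihood, observation):
    """Bayes update of a belief over a hidden, unchanging preference.

    belief:     dict mapping preference -> probability
    likelihood: dict mapping (observation, preference) -> P(observation | preference)
    """
    # Weight each preference by how well it explains the observation.
    unnormalized = {s: p * likelihood[(observation, s)] for s, p in belief.items()}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the current belief")
    # Renormalize so the belief is a probability distribution again.
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical example: the user wants "tea" or "coffee", and the agent
# hears the correct answer with probability 0.8 (an assumed noise level).
prior = {"tea": 0.5, "coffee": 0.5}
likelihood = {
    ("heard_tea", "tea"): 0.8, ("heard_tea", "coffee"): 0.2,
    ("heard_coffee", "tea"): 0.2, ("heard_coffee", "coffee"): 0.8,
}
posterior = update_belief(prior, likelihood, "heard_tea")
```

A point-based POMDP solver would compute values at a finite set of such belief vectors; the abstract's claim concerns how many of those points are needed.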

Similar resources

Structured Parameter Elicitation

The behavior of a complex system often depends on parameters whose values are unknown in advance. To operate effectively, an autonomous agent must actively gather information on the parameter values while progressing towards its goal. We call this problem parameter elicitation. Partially observable Markov decision processes (POMDPs) provide a principled framework for such uncertainty planning t...

Three New Algorithms to Solve N-POMDPs

In many fields in computational sustainability, applications of POMDPs are inhibited by the complexity of the optimal solution. One way of delivering simple solutions is to represent the policy with a small number of α-vectors. We would like to find the best possible policy that can be expressed using a fixed number N of α-vectors. We call this the N-POMDP problem. The existing solver α-min app...

Purely Epistemic Markov Decision Processes

Planning under uncertainty involves two distinct sources of uncertainty: uncertainty about the effects of actions and uncertainty about the current state of the world. The most widely developed model that deals with both sources of uncertainty is that of Partially Observable Markov Decision Processes (POMDPs). Simplifying POMDPs by getting rid of the second source of uncertainty leads to the we...

Hierarchical POMDP Decomposition for A Conversational Robot

POMDPs provide a useful framework for decision-making in the presence of uncertainty. Finding solutions to large-scale problems, however, has proven computationally infeasible. We propose a hierarchical approach to POMDPs which takes advantage of structure in the domain to decompose the problem into a collection of smaller POMDPs. These can be solved independently, allowing us to solve larger pr...

Applying Metric-Trees to Belief-Point POMDPs

Recent developments in grid-based and point-based approximation algorithms for POMDPs have greatly improved the tractability of POMDP planning. These approaches operate on sets of belief points by individually learning a value function for each point. In reality, belief points exist in a highly-structured metric simplex, but current POMDP algorithms do not exploit this property. This paper pres...

Publication date: 2008